Providing context: Extracting non-linear and dynamic temporal motifs from brain activity

Approaches studying the dynamics of resting-state functional magnetic resonance imaging (rs-fMRI) activity often focus on time-resolved functional connectivity (tr-FC). While many approaches have been proposed, these typically focus on linear approaches like computing the linear correlation at a timestep or within a window. In this work, we propose to use a generative non-linear deep learning model, a disentangled variational autoencoder (DSVAE), that factorizes out window-specific (context) information from timestep-specific (local) information. This has the advantage of allowing our model to capture differences at multiple temporal scales. For the timestep-specific scale, which has higher temporal precision, we find significant differences between schizophrenia patients and control subjects in their temporal step distance through our model’s latent space. We also find that window-specific embeddings, or as we refer to them, context embeddings, more accurately separate windows from schizophrenia patients and control subjects than the standard tr-FC approach. Moreover, we find that for individuals with schizophrenia, our model’s context embedding space is significantly correlated with both age and symptom severity. Interestingly, patients appear to spend more time in three clusters, one closer to controls which shows increased visual-sensorimotor, cerebellar-subcortical, and reduced cerebellar-sensorimotor functional network connectivity (FNC), an intermediate station showing increased subcortical-sensorimotor FNC, and one that shows decreased visual-sensorimotor, decreased subcortical-sensorimotor, and increased visual-subcortical domains. We verify that our model captures features that are complementary to - but not the same as - standard tr-FC features. Our model can thus help broaden the neuroimaging toolset in analyzing fMRI dynamics and shows potential as an approach for finding psychiatric links that are more sensitive to individual and group characteristics.


Introduction
Complex dynamical systems like the brain often modulate internal representations across various timescales [1,2]. For example, the complete cognitive process of thinking about dinner and the activity of the neuronal populations that carry out that process occur at vastly different timescales, ranging from slowly varying (coarse-grained) to fast-varying (fine-grained) representations.
Disentangling a timeseries into slowly-varying and fast-varying representations is valuable in a variety of scientific fields, such as video representation learning [3] and healthcare (bio-)signals [4]. It is known that the brain also exhibits a variety of intrinsic neural timescales both during task and resting-state [1] and that these intrinsic neural timescales differ for people diagnosed with schizophrenia [5,6]. In previous work, it has been shown that representations throughout this temporal hierarchy can be modeled based on sensory inputs [2]. However, separating representations at different timescales is harder when sensory inputs are not directly observable, occur at a higher frequency than the imaging modality, or are too complex to accurately encode. These cases require data-driven approaches that can separate slowly-varying from fast-varying representations. Especially given the differences in neuronal timescales for psychiatric subpopulations and the temporal resolution of fMRI data, developing approaches that can infer how representations change over different timescales can aid the discovery of dynamic motifs related to psychiatric disorders such as schizophrenia.
Finding slowly changing representations from neuroimaging data has mostly been explored through functional connectivity (i.e., temporal coherence among isolated regions) or functional network connectivity (FNC) (i.e., temporal coherence among overlapping networks) of the brain [7,8]. These methods have been extended to capture dynamic changes [9][10][11]. One widely used approach, which uses sliding windows to estimate the functional connectivity of the brain in specific windows [12][13][14], has been used to show dynamic properties that are linked to schizophrenia [15]. Although temporal coherence or correlation is an interpretable way of representing a window of brain activity, it is also limiting, since there may be more complete ways of summarizing activity in a particular window. Using the framework of separating slowly-varying from fast-varying representations, we can generalize the idea of representing a window with a single embedding to more abstract embeddings. We call these more abstract representations context motifs since they provide context about the brain activity in that window. Not only do these context representations contain information about longer periods, but the windowing also acts as a low-pass filter that potentially helps denoise some of the underlying information contained in the fMRI data, especially given the low-frequency nature of the BOLD signal and the relationship between its low-frequency signal and schizophrenia [16].
Our work shows that separately modeling the individual timesteps (local information) and the window as a whole (context information) leads to window embeddings that are more linearly separated between schizophrenia patients and controls. We assess the reliability of our model across different random seeds and find that the reliability of the two main models we propose differs based on the configuration of the models. Overall, the models are computationally reliable, and models with fewer context dimensions are more reliable than models with more context dimensions. Then, to understand how our embedding space abstracts away from a functional embedding space, we compare distances in a functional connectivity embedding space with distances in our proposed model's embedding space.
June 27, 2024
We also find that the temporal distance in the local embedding space is significantly lower for schizophrenia subjects than for control subjects. This result is much more pronounced in our models than in the original data. Lastly, a deeper analysis of a model that is computationally reliable and easy to visualize shows three main clusters of schizophrenia patients in the context space: one cluster with windows closer to control subject windows, one middle cluster that slightly overlaps with control subjects, and one cluster that is almost completely separated from control subjects. To understand what these clusters represent, we visualize them as connectivity patterns and find interesting patterns relating to visual-sensorimotor, subcortical-cerebellar, subcortical-sensorimotor, and cerebellar-sensorimotor connectivity. Specifically, the windows in the separated cluster, including reduced visual-sensorimotor and increased subcortical-sensorimotor connectivity, represent unique motifs that are seen in these most separated windows from schizophrenia patients, thus providing a powerful and more fine-grained way to identify functional patterns that are linked to the disorder.

Related work
To our knowledge, there are no methods that explicitly try to disentangle local and context representations from brain data. Apart from approaches based on sliding-window Pearson correlation, which is commonly used to derive brain states (e.g., dFC/dFNC), the most similar model to ours uses a 1D convolutional autoencoder to learn embeddings for windows of fMRI activity [17]. Our work extends this model by learning embeddings for both the window and the individual timesteps in the window, and the interaction between the context and individual timestep embeddings. This allows us to separate local information from the context embeddings. Our work is inspired by progress in adjacent fields [4], and we adapt both the factorized (independent) and unfactorized version of the disentangled sequential autoencoder (IDSVAE and DSVAE, respectively) [3] to brain data.

Model definition
Let $x_{1:T} \in \mathbb{R}^{N \times T}$ denote the multivariate timeseries of a single subject, with $T$ the number of timesteps and $N$ the number of features. In our case, the features are independent component analysis (ICA) component time courses. Our goal is to learn a generative model that can be generalized to new subjects in a test set. We factorize our generative model by separating local information for each timestep from context information for each window. A graphical representation of this factorization is shown in Figure 1.
To achieve this factorization, we first separate the dataset into (overlapping) windows with a window size of 30 timesteps, which is equivalent to 60 seconds for the fMRI data we use in this study. For a window of data $x_{w_j}$, where $j$ indexes over each window, we learn two separate encoders. One context encoder, $\phi^{\text{context}}$ [...] timestep from the fMRI data $x_t$. Since the context embedding is learned for a specific window, it remains the same for all timesteps in a window. We train the model using a reconstruction loss and a variational loss [18], which acts as regularization on the context and local embeddings (Equation 1). Here, $\beta$ and $\gamma$ are hyperparameters that weigh the regularization terms in the total loss. Note that the KL-divergence $D_{KL}(\cdot)$ pushes the distributions parameterized by the context and local encoder to be close to a zero-mean unit-variance normal distribution $p(z)$.
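The training loss described above can be sketched as follows. This is a reconstruction under stated assumptions rather than the exact formula: we assume a mean-squared-error reconstruction term, we assume that $\beta$ weighs the context term and $\gamma$ the local term (the text does not state which is which), and the symbols $q_\theta$, $\hat{x}_t$, and $W$ (the window length) are introduced here purely for illustration.

```latex
\mathcal{L}(x_{w_j}) =
  \underbrace{\frac{1}{W} \sum_{t=1}^{W} \lVert x_t - \hat{x}_t \rVert_2^2}_{\text{reconstruction}}
  + \beta \, D_{KL}\big( q_\theta(z_c \mid x_{w_j}) \,\|\, p(z) \big)
  + \gamma \sum_{t=1}^{W} D_{KL}\big( q_\theta(z_t \mid x_{w_j}, z_c) \,\|\, p(z) \big),
\qquad p(z) = \mathcal{N}(0, I)
```

Under this sketch, the independent variant would drop the conditioning on $z_c$ in the local term, since its local encoder does not receive the context embedding.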

Data and resources
In this work, we use the Function Biomedical Informatics Research Network (fBIRN) phase III data, with over 300 schizophrenia patients and controls [19]. The demographics for control subjects and schizophrenia patients are described in Table 1.
Table 1. Data sample demographics. Note that AP refers to anti-psychotic medication, and AD refers to anti-depressive medication. Thus, AP and AD in this table refer to the percentage of patients taking anti-psychotic and anti-depressive medication, respectively. PANSS is a symptom scale for schizophrenia. We show its positive, negative, and composite scores.
As a preprocessing step, we obtain ICA timeseries from the rs-fMRI data using the fully automated NeuroMark pipeline [21]. Specifically, we implement NeuroMark using the NeuroMark fMRI 1.0 template (released in the GIFT software at http://trendscenter.org/software/gift), which results in 53 ICA components and time courses for each subject.
The code necessary to reproduce the proposed model will be made available on GitHub. Each model is trained with PyTorch 2.0 [22] across 100 different hyperparameter configurations using Ray Tune [23]; new hyperparameters are selected with Optuna search [24], and the best hyperparameter setting is used in the final evaluation of each model. We use the validation mean squared error for each model to evaluate hyperparameters. The hyperparameter ranges for each model are described in Appendix 1. Each model is trained with a GeForce RTX 1080, and we train each model with each of the following random seeds: 42, 1337, 1212, and 9999, to obtain a statistical range across shifts in random initializations.

Experiments
We have devised a series of experiments to verify that our method can find embeddings that are distinctly different from wFNC and are clinically relevant. First, in Section Window classification, we verify whether embeddings obtained with our method are linearly separable, that is, whether we can classify a window of fMRI activity as coming from a schizophrenia patient or a control subject. Second, in Section Reliability

Window classification
To test how well windows of fMRI activity from schizophrenia patients and control subjects are linearly separable in a latent space, we compare our proposed model to two baseline models as well as the widely used wFNC approach. The two baseline models are a model that only uses local embeddings (LVAE) and a model that only uses context embeddings (CO). The latter model has been proposed in previous work [17], and we follow their design choices by using a convolutional encoder and decoder. The LVAE model, in contrast, is the same as our model but only uses local embeddings, without adding the context embedding.

Reliability analysis
It is important for neural network models to converge to the same solution, especially if we want to use them to make clinical inferences. To test whether different instantiations of our DSVAE and IDSVAE models converge to the same embedding space, we perform a reliability analysis. In essence, we are testing the computational reproducibility of the method across different random seeds. The results of this reliability analysis are shown in Figure 3. We train each model across 4 different seeds and then embed the full dataset. Then, for each combination of seeds, we fit a linear regression model to predict the location of each embedding in another seed's embedding space, on the concatenated training and validation set. We then use the average R-squared score across each combination of seeds as a metric of similarity across seeds. Namely, if the context embedding space is the same up to a linear transformation across seeds, we consider the model to converge to the same solution irrespective of the initialization of the network.
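As a minimal sketch of this cross-seed comparison (not the implementation used for the reported results), the snippet below fits a linear map with an intercept from one seed's embeddings to another's and reports the R-squared averaged over target dimensions. The helper names and the synthetic embeddings are hypothetical, and for brevity the score is computed on the same points used for fitting.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b with Gaussian elimination and partial pivoting."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_map(src, dst):
    """Least-squares weights (one list per target dimension), intercept last."""
    design = [list(row) + [1.0] for row in src]
    k = len(design[0])
    gram = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    weights = []
    for d in range(len(dst[0])):
        rhs = [sum(r[i] * y[d] for r, y in zip(design, dst)) for i in range(k)]
        weights.append(solve_linear_system(gram, rhs))
    return weights


def mean_r_squared(src, dst, weights):
    """R-squared of the fitted map, averaged over target dimensions."""
    scores = []
    for d, w in enumerate(weights):
        pred = [sum(wi * xi for wi, xi in zip(w, list(row) + [1.0])) for row in src]
        true = [y[d] for y in dst]
        mean = sum(true) / len(true)
        ss_res = sum((t - p) ** 2 for t, p in zip(true, pred))
        ss_tot = sum((t - mean) ** 2 for t in true)
        scores.append(1.0 - ss_res / ss_tot)
    return sum(scores) / len(scores)


# Synthetic 2-dimensional context embeddings standing in for three seeds.
rng = random.Random(0)
seed_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
# Seed B is a rotated and shifted copy of seed A (same solution up to a linear map).
seed_b = [[0.6 * x - 0.8 * y + 1.0, 0.8 * x + 0.6 * y - 2.0] for x, y in seed_a]
# Seed C is unrelated to seed A (a different solution).
seed_c = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]

r2_same = mean_r_squared(seed_a, seed_b, fit_linear_map(seed_a, seed_b))
r2_diff = mean_r_squared(seed_a, seed_c, fit_linear_map(seed_a, seed_c))
print(f"linearly related seeds: {r2_same:.3f}, unrelated seeds: {r2_diff:.3f}")
```

In this toy setting, a score near 1 indicates that two seeds found the same space up to a linear transformation, whereas a score near 0 indicates that the two spaces are not linearly related.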
In Section Window classification we saw that the IDSVAE model outperformed the DSVAE model for LS=4,CS=2 and LS=8,CS=4, and in Figure 3 we can see that the IDSVAE versions of these models have low reliability. Moreover, we can see a trend in the reliability plots, where higher is better: larger context dimensionality generally leads to worse reliability. This is largely expected since higher-dimensional spaces are essentially 'bigger', and there are thus more variations of embedding spaces that can lead to a good generative model of the data. Furthermore, in most cases, the DSVAE model is more reliable than the IDSVAE model. In particular, the LS=2,CS=2 model has a high classification accuracy (see Figure 2), good computational reproducibility (see Figure 3), and, importantly for the next sections, is easy to visualize because both the local and context embeddings are 2-dimensional. Hence, we will use this model for further analysis in the upcoming sections.

Manifold comparisons
Given our model's improved performance over wFNC in Section Window classification, we look at the similarity between our model's embedding space and that of wFNC. To formalize this notion, we compare the similarity between normalized distances in the wFNC space and our model's embedding space. First, we embed all of the windows in our model's embedding space and compute their wFNCs. Then, we calculate the distance matrix between the windows in both the wFNC space and our model's embedding space using the Euclidean distance. Lastly, we train a linear regression model to predict distances between windows in our model's embedding space from distances between windows in the wFNC space. The R-squared score is 0.19, which indicates that our model does not learn features that are essentially similar to connectivity features. We additionally provide a visualization of wFNC features based on our model's embedding space in Appendix 2.
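The distance comparison can be illustrated with a small sketch. It assumes that distances are normalized by dividing by the maximum distance (the normalization is not specified above) and uses synthetic features in place of real wFNC vectors and model embeddings. For a regression with a single predictor, the R-squared equals the squared Pearson correlation, which is what the hypothetical helper `simple_r_squared` computes.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distances over all unordered pairs of windows."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]


def normalize(values):
    """Scale distances to [0, 1] (assumed normalization)."""
    top = max(values)
    return [v / top for v in values]


def simple_r_squared(x, y):
    """R-squared of the least-squares line y ~ a * x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


rng = random.Random(0)
# Stand-ins for wFNC feature vectors and for an unrelated 2-D embedding.
wfnc_features = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(80)]
unrelated_embedding = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(80)]

d_wfnc = normalize(pairwise_distances(wfnc_features))
d_model = normalize(pairwise_distances(unrelated_embedding))

r2_self = simple_r_squared(d_wfnc, d_wfnc)        # identical geometry
r2_unrelated = simple_r_squared(d_wfnc, d_model)  # unrelated geometry
print(f"identical: {r2_self:.2f}, unrelated: {r2_unrelated:.2f}")
```

A low score in this comparison means that windows that are close in one space are not systematically close in the other.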
As a measure of dynamics on the manifold, we also look at the distance traveled with each timestep, and how this measure may differ between schizophrenia patients and control subjects. For the context embeddings in our model, we do not find a significant difference. However, for the local embeddings, we find that the distance between timesteps for schizophrenia patients is significantly lower than for control subjects. We also compare our results to the local-only model with the same latent dimension, and to the original data. To calculate the z-score of the aforementioned result, we first calculate the Euclidean distance between consecutive timesteps. Then, we compute the difference between the average distance for schizophrenia patients and control subjects. To find the z-score for each model, we randomly permute the labels of the subjects 10000 times and compute the difference between the average distances for each group. Lastly, we compute the z-score for the actual mean difference based on the actual labels using the mean and standard deviation of the permuted differences. For the DSVAE model, we find a z-score of −5.3, for the LVAE model a z-score of −5.8, and for the original inputs −2.1. The group difference in temporal distance is thus considerably more pronounced in the DSVAE and LVAE models than in the original inputs.
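The permutation procedure can be sketched as follows. This is an illustration only: it assumes that step distances are first averaged per subject before group means are compared, and it uses synthetic random walks in place of real local embeddings. All function names are hypothetical.

```python
import math
import random
import statistics


def mean_step_distance(embeddings):
    """Average Euclidean distance between consecutive timesteps of one subject."""
    steps = [math.dist(a, b) for a, b in zip(embeddings, embeddings[1:])]
    return sum(steps) / len(steps)


def group_difference(values, labels):
    """Mean over patients (label 1) minus mean over controls (label 0)."""
    patients = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return statistics.fmean(patients) - statistics.fmean(controls)


def permutation_z_score(values, labels, n_permutations=10000, seed=0):
    """z-score of the observed difference against label-permuted differences."""
    rng = random.Random(seed)
    observed = group_difference(values, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        null.append(group_difference(values, shuffled))
    return (observed - statistics.fmean(null)) / statistics.stdev(null)


rng = random.Random(1)


def random_walk(step_scale, n_steps=50):
    """Synthetic 2-D trajectory whose step size is controlled by step_scale."""
    pos, path = [0.0, 0.0], []
    for _ in range(n_steps):
        pos = [p + rng.gauss(0, step_scale) for p in pos]
        path.append(pos)
    return path


# 20 synthetic "patients" taking smaller steps and 20 synthetic "controls".
subjects = [random_walk(0.5) for _ in range(20)] + [random_walk(1.0) for _ in range(20)]
labels = [1] * 20 + [0] * 20
step_means = [mean_step_distance(s) for s in subjects]

z = permutation_z_score(step_means, labels, n_permutations=2000)
print(f"z-score: {z:.1f}")
```

A strongly negative z-score, as in this toy example, corresponds to the patient group taking smaller steps than the control group.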

Cluster analysis
After verifying our model's computational reproducibility, we visualize the embedding space of a highly computationally reproducible version of our model. For the visualization, we use the LS=2,CS=2 version of our model. Since the context embeddings are 2-dimensional, we can easily visualize the embedding space of our model, as shown in Figure 4. In the patient-only version of the plot, there seem to be three clusters, which roughly correspond to windows that are more similar to windows from control subjects (top), windows that are on the boundary (middle), and windows that are completely dissimilar from any control subject windows. We found the cluster centers using K-means clustering [25]; the cluster centers are shown as black squares in the leftmost subfigure in Figure 4. The cluster that is most separated from control windows visually seems to represent a cluster of older subjects with a low CMINDS [26] score, as shown in Figure 4. To verify this visual relationship between the most separated cluster of windows from schizophrenia subjects and age and CMINDS score, we perform statistical tests. First, we calculate a two-sided t-test between subjects diagnosed with schizophrenia who have a window in the cluster and subjects diagnosed with schizophrenia who do not. We find significant differences for both age (p < 5E−5, t = 4.98) and CMINDS score (p < 0.005, t = −3.21). Lastly, we count the number of times a window from a subject appears in the cluster and calculate the Pearson correlation between the time spent in the cluster and the age/CMINDS score for schizophrenia patients. This analysis is similar to dwell time analyses [27]. Again, we find significant correlations for age (p < 5E−5, ρ = 0.38) and CMINDS score (p < 0.05, ρ = −0.19).
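The dwell-time correlation can be illustrated with a toy example. The sketch below counts, per subject, how many windows fall in a target cluster and correlates that count with age. The subject identifiers, cluster labels, and ages are made up for illustration, and the p-value computation is omitted.

```python
import math
from collections import Counter


def dwell_counts(window_subjects, window_clusters, target_cluster, all_subjects):
    """Number of windows each subject has in the target cluster."""
    counts = Counter(
        subj
        for subj, cluster in zip(window_subjects, window_clusters)
        if cluster == target_cluster
    )
    return [counts.get(subj, 0) for subj in all_subjects]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data: four subjects with three windows each, assigned to clusters 0, 1, or 2.
all_subjects = ["s1", "s2", "s3", "s4"]
ages = [25, 35, 45, 55]
window_subjects = ["s1"] * 3 + ["s2"] * 3 + ["s3"] * 3 + ["s4"] * 3
window_clusters = [0, 0, 1, 0, 2, 1, 2, 2, 0, 2, 2, 2]

dwell = dwell_counts(window_subjects, window_clusters, 2, all_subjects)
r_age = pearson_r(dwell, ages)
print(dwell, round(r_age, 3))
```

In this constructed example the dwell counts increase linearly with age, so the correlation is 1; real data would of course yield a much weaker relationship.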

Cluster visualization
To interpret the clusters and the types of motifs they represent, we visualize the averaged wFNC matrix for each cluster. To find the wFNC matrix for each cluster, we compute the wFNC representation for each point in the embedding space and average the wFNCs of points that belong to the same cluster.
Lastly, to highlight the differences between the clusters, we subtract the average schizophrenia wFNC from each of the clusters. These final wFNC representations of the clusters are shown in Figure 5.
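The averaging and subtraction steps can be sketched as follows, assuming that the wFNC of a window is the Pearson correlation matrix between its component time courses. The function names and the two toy windows are hypothetical, and the toy "average schizophrenia wFNC" is simply the mean over all windows passed in.

```python
import math


def correlation_matrix(window):
    """Pearson correlations between components; window is N lists of W values."""
    centered = []
    for series in window:
        mean = sum(series) / len(series)
        centered.append([v - mean for v in series])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    n = len(window)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def mean_matrix(matrices):
    """Element-wise average of equally sized square matrices."""
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
        for i in range(n)
    ]


def cluster_contrasts(wfncs, cluster_labels):
    """Average wFNC per cluster minus the average wFNC over all windows."""
    overall = mean_matrix(wfncs)
    contrasts = {}
    for label in sorted(set(cluster_labels)):
        members = [m for m, lab in zip(wfncs, cluster_labels) if lab == label]
        avg = mean_matrix(members)
        contrasts[label] = [
            [a - o for a, o in zip(row_a, row_o)]
            for row_a, row_o in zip(avg, overall)
        ]
    return contrasts


# Two toy windows with two components and four timesteps each.
window_a = [[1, 2, 3, 4], [2, 4, 6, 8]]  # perfectly correlated components
window_b = [[1, 2, 3, 4], [4, 3, 2, 1]]  # perfectly anti-correlated components
wfncs = [correlation_matrix(window_a), correlation_matrix(window_b)]
contrasts = cluster_contrasts(wfncs, cluster_labels=[0, 1])
print(contrasts[0][0][1], contrasts[1][0][1])
```

Positive entries in a contrast matrix indicate connectivity that is higher in that cluster than in the patient average, and negative entries indicate lower connectivity.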
In Section Cluster analysis we saw that the three clusters essentially encode a gradient from windows that are similar to control windows in Cluster 1, to windows that are not necessarily similar to control windows but also not easily separable in Cluster 2, to almost completely separated windows from schizophrenia patients in Cluster 3. Cluster 1, which is the cluster closest to controls, shows increased visuo-motor, cerebellar-subcortical, and reduced cerebellar-sensorimotor functional network connectivity (FNC). The middle cluster (Cluster 2) is generally less different from the average schizophrenia wFNC than Clusters 1 and 3, but shows increased subcortical-sensorimotor FNC, whereas Clusters 1 and 3 do not. Lastly, the cluster that is most separated from the control windows shows reduced visuo-sensorimotor, increased subcortical-motor, and increased subcortical-visual connectivity.

Discussion
In this work, we proposed a model that can be used as an alternative to wFNC, which is a method that is commonly employed to analyze the dynamics of fMRI activity. In Section Window classification, we first show that both of our proposed methods are better than both the baseline models and wFNC in separating windows of rs-fMRI activity from schizophrenia patients. These results indicate that by factoring out timestep-specific and window-specific information, we obtain context embeddings that contain information about the subject's diagnosis. In Section Introduction we hypothesize that factoring out context embeddings potentially helps the model focus on the more low-frequency signal in the rs-fMRI timeseries, potentially de-noising some of the signals. Spontaneous low-frequency fluctuations in the BOLD signal have previously been linked to schizophrenia as well [16]. Moreover, it is unlikely that individual timesteps are directly linked to psychiatric disorders. Instead, our context embeddings obtain information from a larger set of timesteps. Since windows capture longer periods of fMRI activity, they are more likely to contain information about cognitive function.
Furthermore, the dynamics within a window may be important in recognizing potentially dysfunctional motifs of fMRI activity. These dynamics can only be captured in a window of activity, as opposed to a single timestep. To verify that our results are computationally reproducible, we then calculate how similar embedding spaces are across initializations in Section Reliability analysis. Our results show that the DSVAE model is more often computationally reproducible than the IDSVAE model. Together with the slightly better performance in Section Window classification across the local and context dimensionality, this led us to use a DSVAE model with local size = 2 and context size = 2 for further analysis. One larger trend we observe for the computational reproducibility results is that larger context sizes lead to less reproducible models. This observation makes sense because higher dimensionality, paired with a KL-divergence regularization, leads to a 'larger' overall space to span. Thus, across random initializations, the model can find 'good' solutions that have different latent spaces, which is a consequence of underspecification [28]. Lastly, models with smaller context sizes in particular are computationally reproducible, with R-squared scores between 0.85 and 0.90. Given the computational reproducibility of the DSVAE model with 2-dimensional local and context embedding spaces, we further analyzed its embedding space in subsequent sections.
In Section Manifold comparisons we additionally find that when we factorize out the context embeddings, the local embeddings of schizophrenia patients take significantly smaller steps in the local embedding space than the local embeddings for control subjects. This difference is also considerably larger for our model and the local-only embedding model than for the original data. This means that the non-linear embedding of timesteps helps uncover this difference for people with a schizophrenia diagnosis. Earlier work [29] reports a similar result, where resting-state fMRI meta-states show reduced dynamism in schizophrenia patients. Although the previous work uses functional connectivity and meta-states, we observe a similar result for local (timestep) embeddings because timesteps are temporally encoded significantly closer to each other for schizophrenia patients than for controls. This indicates a smaller dynamic range for schizophrenia patients because with each step they cover significantly smaller distances than control subjects. Then, to verify that the features our model learns are unique, we compare our model's embedding space to the wFNC embedding space. We find that our model learns different features than wFNC does. This means our model provides complementary results to wFNC, and can thus help future research in uncovering motifs that are not connectivity-based. This is potentially important: because correlations are invariant to scaling, temporal permutation (if the permutation is the same for all inputs), and the addition of a constant, connectivity features from correlations are highly specific. Although connectivity is interpretable, we propose a method that can learn more abstract embeddings that are more indicative of schizophrenia, and potentially other psychiatric diagnoses.
Lastly, in our cluster analysis, we found a cluster of windows from schizophrenia subjects with significantly lower CMINDS scores and higher ages. The relationship between lower CMINDS scores and higher age has been reported previously [26]. A lower CMINDS score indicates higher symptom severity, and in the results presented in Section Cluster analysis, the y-axis captures both symptom severity and the age of the subject. In [26] [...]
To interpret how the three clusters differ from each other in terms of the dynamic motifs that they represent, we visualized the corresponding wFNC of each cluster. The three schizophrenia clusters roughly correspond to 1) windows that are similar to control windows, 2) a cluster that is in between schizophrenia patients and control subjects, and 3) a cluster with windows that are almost entirely separated from control subject windows. The first cluster shows increased visual-sensorimotor, cerebellar-subcortical, and reduced cerebellar-sensorimotor functional network connectivity (FNC), which are connectivity patterns that align with previous connectivity-based comparisons between control subjects and schizophrenia patients [30]. The second cluster shows increased subcortical-sensorimotor FNC, which is intriguing because it is only increased in this cluster, and more generally decreased in the other two clusters, even though the other two clusters are the most dissimilar. These findings align with, but extend, previous work [31] that found increased functional connectivity between the thalamus and sensorimotor network for schizophrenia patients with psychomotor excitation, as opposed to decreased connectivity for schizophrenia patients with psychomotor inhibition. Lastly, the most separated cluster shows decreased visual-sensorimotor, decreased subcortical-sensorimotor, and increased visual-subcortical connectivity. The decreased subcortical-sensorimotor connectivity is observed in the cluster that is most similar to control subjects as well, but in combination with increased visual-subcortical and decreased subcortical-sensorimotor connectivity, the motif reflects highly separable windows. These increases and decreases are with respect to the rest of the schizophrenia patients, and thus do not necessarily reflect a decrease or increase with respect to control subjects. Interestingly, increased subcortical-visual connectivity is, in this case, associated with windows that are more separable from control subjects; impairment of the subcortical-visual pathway has been hypothesized to be an important early indicator of schizophrenia progression [32].

Future work
The model we propose in this work can be extended in future work to additional datasets and other psychiatric disorders. In its current form, the model makes no assumptions about the structure of the data, except that it is possible to create temporal windows. It can thus be used for both task and resting-state fMRI data, or even EEG data.

Conclusion
In this work, we proposed a more abstract way of summarizing windowed rs-fMRI that can complement popular approaches like wFNC. We first show that our proposed models outperform baselines and wFNC in separating windows from schizophrenia patients and control subjects. For both of our proposed models, we assess their computational reliability, which is an important aspect of biomedical machine-learning models. With a computationally reliable and interpretable model, we then dig deeper into how the trained model differs from wFNC in the features that it learns, and we additionally find a reduced dynamic range for schizophrenia subjects in the local embedding space. Moreover, for individuals with schizophrenia, we find that a separated cluster of windows from schizophrenia patients in our model's context embedding space is significantly correlated with both age and symptom severity. In general, patient windows appear to be split into three clusters. The first contains windows more similar to controls and shows increased visuo-sensorimotor, cerebellar-subcortical, and reduced cerebellar-sensorimotor functional network connectivity (FNC). For the station that slightly overlaps with control subjects, we find increased subcortical-sensorimotor FNC, which is not found in the other two clusters and is an interesting result that may indicate schizophrenia patients with psychomotor excitation. The last cluster, which is most separated from control subjects, shows decreased visuo-sensorimotor, decreased subcortical-sensorimotor, and increased visuo-subcortical connectivity. Our proposed model can thus help broaden the neuroimaging toolset in analyzing fMRI dynamics and shows potential as an approach for finding psychiatric links that are more sensitive to individual and group characteristics.

Appendix B: Geometric manifold comparison visualization
To visually verify that our model's embedding space is not capturing wFNC features, we visualize our model's embedding space using the Jonker-Volgenant algorithm [33] in Figure 6. We can see that similar wFNCs are not necessarily close together in our model's embedding space. Since correlation is invariant to scaling, temporal permutation (if the permutation is the same for all inputs), and the addition of a constant, connectivity features from correlations are highly specific. Indeed, we can visually see that our model captures different features than wFNC does. Thus, our method can uncover novel yet complementary motifs for this schizophrenia population from rs-fMRI data.
These encoders have learnable parameters $\theta$. The context encoder takes the full window as input, $z_c = \phi^{\text{context}}_\theta(x_{w_j})$, where we refer to $z_c$ as a context embedding. For the DSVAE model, we concatenate this context vector $z_c$ with the window $x_{w_j}$ to form the input to the local encoder: $z_{1:W} = \phi^{\text{local}}_\theta([z_c, x_{1:W}])$. For the factorized (independent) version of the DSVAE [3] (IDSVAE) that we also use in this paper, the [...]

Fig 1. An abstract depiction of our model. For the independent version of the model, the global representation is not concatenated inside the local encoder.

Fig 2. The window classification accuracy for two of our proposed models (DSVAE, IDSVAE), windowed FNC (wFNC), and two baseline methods: context-only (CO) and local-only (LVAE). Our proposed methods outperform all other methods. Experiments are performed across 4 seeds.

Fig 3. Each subfigure shows a different local size (LS) and context size (CS) configuration, where the reliability of the model across different initializations is measured in R-squared. In most cases, the DSVAE model is significantly more reliable than the IDSVAE model, except for LS=4,CS=4, LS=8,CS=4, and LS=8,CS=8. Moreover, reliability decreases with a larger dimensionality of the context space, likely because increasing the dimensions essentially increases the size of the space and thus the number of equivalent solutions.

Fig 4. These subfigures show a visualization of the context embeddings for the DSVAE model with LS=2,CS=2. The first subfigure from the left shows three patient clusters, the second subfigure both patients and controls, the third colors context embeddings based on the subject's age, and the last subfigure colors the context embeddings based on the subject's cognitive score, CMINDS.

Fig 5. The three schizophrenia patient clusters visualized using functional connectivity matrices. To create the visualizations, we take all the windows belonging to a cluster and average their wFNC matrices.

Figure 6. The figure is created by first embedding each of the windows using our LS=2, CS=2 model, the same model we used in Section Cluster analysis. Then, we used the Jonker-Volgenant algorithm to create the visualization from the locations of points in the embedding space; note that each point corresponds to a window in the rs-fMRI signal. Lastly, we visualized each point/window as its wFNC, which allows us to see whether similar wFNC patterns are close together in our model's embedding space. If this is not the case, then our model is likely capturing complementary features that are important (see Section Window classification) and interesting (see Section Cluster analysis).
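A common use of the Jonker-Volgenant algorithm in visualization is to assign scattered 2D points to the cells of a regular grid so that each point gets its own tile while staying near its original location. Whether the figure was built exactly this way is an assumption. The sketch below uses synthetic embeddings and SciPy's `linear_sum_assignment`, which the SciPy documentation describes as a modified Jonker-Volgenant algorithm; the grid size is a placeholder.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
side = 6                       # hypothetical grid of side x side tiles
n = side * side
emb = rng.standard_normal((n, 2))   # synthetic 2D context embeddings

# Rescale the embeddings to the unit square so they are comparable to the grid.
emb01 = (emb - emb.min(axis=0)) / (emb.max(axis=0) - emb.min(axis=0))

# Regular grid of tile centers in the unit square.
gx, gy = np.meshgrid(np.linspace(0, 1, side), np.linspace(0, 1, side))
grid = np.column_stack([gx.ravel(), gy.ravel()])

# Cost of placing window i on tile j: squared Euclidean distance.
cost = ((emb01[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)

# One-to-one assignment that minimizes the total displacement.
rows, cols = linear_sum_assignment(cost)
cell_of_window = np.empty(n, dtype=int)
cell_of_window[rows] = cols
```

Each tile at `grid[cell_of_window[i]]` could then be drawn as the wFNC matrix of window `i`, so that neighboring tiles correspond to neighboring points in the embedding space.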

Table 1
[20] of these consortiums records diagnosis, age at the time of the scan, gender, illness duration, symptom scores, and current medication, when available. The inclusion criteria were that participants were between 18 and 65 years of age and that their schizophrenia diagnosis was confirmed by trained raters using the Structured Clinical Interview for DSM-IV (SCID) [20]. All participants with a schizophrenia diagnosis had been on a stable dose of antipsychotic medication (typical, atypical, or a combination) for at least two months. Each participant with a schizophrenia diagnosis was clinically stable at the time of the scan. Control subjects were excluded if they had a current or past psychiatric illness or if a first-degree relative had an Axis-I psychotic disorder. These diagnoses were based on the SCID assessment. Written informed consent was obtained from all study participants under protocols approved by the Institutional Review Boards at each consortium site.

analysis, we evaluate how reliable each method is across initialization seeds. Models with similar embeddings across initializations are more robust and thus more impactful. Clinical inferences cannot be made based on models that depend too much on the random seed, because it is hard to decide which random seed yields the 'correct' model to derive inferences from. Third, in Section Cluster analysis, we analyze what clusters we find in our model's embedding space and how these clusters relate to demographic variables and cognitive scores. Lastly, in Section Manifold comparisons, we analyze how the wFNC embeddings differ from the embeddings our model produces. One of the goals of our work is to generalize wFNC embeddings, so we want to analyze whether the embeddings are significantly different by looking at distances between windows in both spaces and at how well-matched those distances are between our model's embedding space and that of wFNC.
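The idea of comparing distances between windows in two spaces can be sketched as follows. This is an illustration on synthetic data, not the analysis from Section Manifold comparisons: the Pearson correlation between pairwise distances, the helper names, and the array sizes are assumptions.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distances between all unique pairs of rows."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(points.shape[0], k=1)
    return dist[upper]


def distance_agreement(space_a, space_b):
    """Pearson correlation between pairwise distances in two spaces."""
    return np.corrcoef(pairwise_distances(space_a),
                       pairwise_distances(space_b))[0, 1]


rng = np.random.default_rng(0)
n_windows = 100
model_emb = rng.standard_normal((n_windows, 2))    # stand-in for context embeddings
wfnc_like = rng.standard_normal((n_windows, 10))   # stand-in for vectorized wFNC

same = distance_agreement(model_emb, 2.0 * model_emb)   # identical geometry
different = distance_agreement(model_emb, wfnc_like)    # unrelated geometry
```

A value near 1 would indicate that windows close together in one space are also close in the other, whereas a low value suggests the two spaces organize the same windows differently.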
it was found that the CMINDS score does not exhibit significant group-by-age interactions. It is thus not entirely clear how age and CMINDS score interact with schizophrenia diagnosis. However, our finding that there is a cluster

This also means future work can utilize window size generalizations, such as hierarchical windows or windows that vary in size across the timeseries. It is also important in future work to develop interpretable ways of visualizing the data. Although we use wFNC matrices to visualize the clusters in this work, we also find that our model learns new features that are not captured by wFNC, so there are potentially other ways to visualize our model's embedding space. The complementary features our model learns open up a different space in which we can study the rs-fMRI dynamics of psychiatric patients and potentially enable a deeper understanding of psychiatric disorders such as schizophrenia. Given the heterogeneous nature of certain psychiatric disorders, including schizophrenia, it is important to understand how brain activity differs across a variety of features. In this work, we have started exploring how schizophrenia patients vary across the features our model learns, but we believe our results warrant further clinical research.